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Abstract 

Global sensitivity analysis is used to quantify the influence of uncertain input parame- 
ters on the response variability of a numerical model. The common quantitative methods 
are appropriate with computer codes having scalar input variables. This paper aims at 
illustrating different variance-based sensitivity analysis techniques, based on the so-called 
Sobol's indices, when some input variables are functional, such as stochastic processes or 
random spatial fields. In this work, we focus on large cpu time computer codes which need 
a preliminary metamodeling step before performing the sensitivity analysis. We propose 
the use of the joint modeling approach, i.e., modeling simultaneously the mean and the 
dispersion of the code outputs using two interlinked Generalized Linear Models (GLM) 
or Generalized Additive Models (GAM). The "mean model" allows to estimate the sensi- 
tivity indices of each scalar input variables, while the "dispersion model" allows to derive 
the total sensitivity index of the functional input variables. The proposed approach is 
compared to some classical sensitivity analysis methodologies on an analytical function. 
Lastly, the new methodology is applied to an industrial computer code that simulates the 
nuclear fuel irradiation. 

Keywords: Sobol's indices, joint modeling, generalized additive model, metamodel, 
stochastic process, uncertainty 



1 INTRODUCTION 

Modern computer codes that simulate physical phenomenas often take as inputs a high 
number of numerical parameters and physical variables, and return several outputs - 
scalars or functions. For the development and the use of such computer models, Sensitivity 
Analysis (SA) is an invaluable tool. The original technique, based on the derivative 
computations of the model outputs with respect to the model inputs, suffers from strong 
limitations for computer models simulating non-linear phenomena. More recent global 
SA techniques take into account the entire range of variation of the inputs and aim to 
app ortion the whole output uncertainty to the input factor uncertainties (Saltelli et al. 
2l|). The global SA methods can also be used for model calibration, model validation, 
decision making process, i.e., any process where it is useful to know which are the variables 
that mostly contribute to the output variability. 

The common quantitative methods are applicable to computer codes with scalar input 
variables. For example, in the nuclear engineering domain, global SA tools have been 
applied to numerous models where all the uncertain input parameters are modeled by 
random variables, possibly correlated - such as thermal-hydraulic system codes (Marques 



et al. fl3!|), waste storage safety studies (Helton et al. |7|), environmental model of dose 
calculations (looss et al. reactor dosimetry processes (Jacques et al. fill). Recent 

research papers have tried to consider more complex input variables in the global SA 
process, especially in petroleum and environmental studies: 

• Tarantola et al. work on an environmental assessment on soil models that use 
spatially distributed maps affected by random errors. This kind of uncertainty is 
modeled by a spatial random field (following a specified probability distribution), 
simulated at each code run. For the SA, the authors propose to replace the spatial 
input parameter by a "trigger" random parameter ^ that governs the random field 
simulation. For some values of ^, the random field is simulated and for the other 
values, the random field values are put to zero. Therefore, the sensitivity index of ^ 
is used to quantify the influence of the spatial input parameter. 



• Ruffo et al. [18[ evaluate an oil reservoir production using a model that depends 
on different heterogeneous geological media scenarios. These scenarios, which are of 
limited number, are then substituted for a discrete factor (a scenario number) before 
performing the SA. 
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• looss et al. study a groundwater radionuclide migration model which depend 
on several random scalar parameters and on a spatial random field (a geostatisti- 
cal simulation of the hydrogeological layer heterogeneity). The authors propose to 
consider the spatial input parameter as an "uncontrollable" parameter. Therefore, 
they fit on a few simulation results of the computer model a double model, called a 
joint model: the first component models the effects of the scalar parameters while 
the second models the effects of the "uncontrollable" parameter. 

In this paper, we tackle the problem of the global SA for numerical models and when 
some input parameters e are functional. s{u) is a one or multi-dimensional stochastic 
function where u can be spatial coordinates, time scale or any other physical parameters. 
Our work focuses on models that depend on scalar parameter vector X and involve some 
stochastic process simulations or random fields e{u) as input parameters. The computer 
code output Y depends on the realizations of these random functions. These models are 
typically non linear with strong interactions between input parameters. Therefore, we 
concentrate our methodology on the variance based sensitivity indices estimation; that is, 
the so-called Sobol's indices (Sobol 



25|, Sahelh et al. [21]). 
To deal with this situation, a first natural approach consists in using either all the 
discretized values of the input functional parameter e{u) or its decomposition into an 
appropriate basis of orthogonal functions. Then, for all the new scalar parameters related 
to £(m), sensitivity indices are computed. However, in the case of complex functional 
parameters, this approach seems to be rapidly intractable as these parameters cannot 
be represented by a small number of scalar parameters (Tarantola et al. {2?^). More- 
over, when dealing with non physical parameters (for example coefficients of orthogonal 
functions used in the decomposition), sensitivity indices interpretation may be laborious. 
Indeed, most often, physicists would prefer to obtain one global sensitivity index related 
to e{u). Finally, a major drawback for the decomposition approach is related to the un- 
certainty modeling stage. More precisely, this approach needs to specify the probability 
density functions for the coefficients of the decomposition. 

The following section presents three different strategies to compute the Sobol's in- 
dices with functional inputs: (a) the macroparameter method, (b) the "trigger" parameter 
method and (c) the proposed joint modeling approach. Section [3] compares the relevance 
of these three strategies on an analytical example: the WN-Ishigami function. Lastly, the 
proposed approach is illustrated on an industrial computer code simulating fuel irradiation 
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in a nuclear reactor. 



2 COMPUTATIONAL METHODS OF SOBOL'S IN- 
DICES 

First, let us recall some basic notions about Sobol's indices. Let define the model 

/ : KP ^ M 

(1) 

X ^Y = f{X) 

where Y is the code output, X — (Xi, . . . , are p independent inputs, and / is the 
model function. / is considered as a "black box", i.e. a function whose analytical formula- 
tion is unknown. The main idea of the variance-based SA methods is to evaluate how the 
variance of an input or a group of input parameters contributes to the output variance of 
/. These contributions are described using the following sensitivity indices: 

_ Var[E(y|X,)] _ Var [E ^ ^ ^ ^ 

YariY) ' ' Var(y) ^vk - ■ ■ ■ [Z) 

These coefficients, namely the Sobol's indices, can be used for any complex model functions 
/. The second order index Sij expresses the model sensitivity to the interaction between 
the variables Xi and Xj (without the first order effects of Xi and Xj ) , and so on for higher 
orders effects. The interpretation of these indices is natural as all indices lie in [0, 1] and 
their sum is equal to one. The larger an index value is, the greater is the importance of 
the variable or the group of variables related to this index. 

For a model with p inputs, the number of Sobol's indices is 2^ — 1; leading to an 
intractable number of indices as p increases. Thus, to express the overall output sensitivity 
to an input Xi, Homma & Saltelli [sl introduce the total sensitivity index: 



St, = + + E ^^^'^ + . . . = E (3) 

where #z represents all the "non-ordered" subsets of indices containing index i. Thus, 
X); e #i Si is the sum of all the sensitivity indices having i in their index. The estimation 
of these indices (Eqs. ([2]) and ([3])) can be performed by simple Monte-Carlo simulations 
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20|), or by refined sampling designs 



based on independent samples (Sobol [3], Saltelli 
introduced to reduce the number of required model evaluations signijicantly, for instance 
FAST (Saltelli et al. 



23| ) and quasi-random designs (Saltelli et al. 124I). 
Let us now consider a supplementary input parameter which is a functional input 
variable e{u) E M where tt e M'' is a d-dimensional location vector. e{u) is defined by all 
its marginal and joint probability distributions. In this work, it is supposed that random 
function realizations can be simulated. For example, these realizations can be produced 
using geostatistical simulations (Lantuejoul I4I) or stochastic processes simulations (Gen- 
tle [5|). Our model writes now 

Y = f{X,s) (4) 

and in addition to the Sobol's indices related to X , our goal is to derive methods to com- 
pute the sensitivity indices relative to e, i.e., (first order index), St^ (total sensitivity 
index), Sie (second order indices), S'ye, . . . 

2.1 The macroparameter method 

With the macroparameter method, the functional input parameter is not seen as a func- 
tional by the computer code. It is discretized in a potentially large number of values 
(for example several thousands), each of them being an input scalar parameter of the 
computer code. As all these values come from the functional input parameter (which pos- 
seses a specific correlation structure) , they can be considered as an ensemble of correlated 
input parameters. Taking into account correlation between input variables in sensitivity 
analysis has been a challenging problem, recently solved by a few authors (see Da Veiga 
et al. for a recent review). 

One solution, proposed by Jacques et al. llj, to deal with correlated input parameters. 



25|): each group of correlated 



is to consider multi-dimensional sensitivity indices (Sobol 
parameters is considered as a multi-dimensional parameter or macroparameter. One there- 
fore performs a sensitivity analysis by groups of correlated parameters. To estimate Sobol 
indices (first order, second order, . . . , total), a large number of input parameters (corre- 
lated and non correlated) have to be generated. As we know how to generate independent 
samples of a correlated variables group, the simple Monte-Carlo sampling technique can 
be used (Sobol [24J, Saltelli ^Q})- However, more efficient techniques than simple Monte- 
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Carlo (in terms of the required size sample), as FAST or quasi Monte-Carlo which use 
deterministic samples, are prohibited with correlated input variables. 

In our context, this approach, using the simple Monte-Carlo algorithm, seems to be 
relevant as the input functional parameter e{u) can be considered as a single multi- 
dimensional parameter (i.e. a macroparameter) . For instance, the first order Sobol's 
index related to e{u) is defined as previously by 



Var[E {Y\e)] 
Var(r) 



(5) 



A simple way to estimate — D^/D is based on the Sobol 



24| algorithm: 



k=l 

^ = Tr^E/'(^i'^^'^)-/o (6b) 

k=l 
1 ^ 

= TE/(^i'^^fc)/(^i'^^fc)-/o (6c) 



N- . 

k=l 

where (^[.^'')fe=i...Ar and {X'"^'')k=i...N are two independent sets of N simulations of the 
input vector X and {sk)k=i...N is a sample of N realizations of the random function e{u). 
To compute the sensitivity indices Si, the same algorithm is used with two independent 
samples of {£k)k=i...N- In the same way, the total sensitivity index St^ is derived from 
the algorithm of Saltelli 20|. 



The major drawback of this method is that it may be cpu time consuming, mainly 
because of the sampling method. If d is the number of indices to be estimated, the cost 
of the Sobol's algorithm is n = N{d + 1) while the cost of Saltelli's algorithm to estimate 
d first order and d total sensitivity indices is n = N{d + 2). It is well known that, 
for complex computer models, an accurate estimation of Sobol's indices by the simple 
Monte-Carlo method (independent random samples) requires N > 1000, i.e. more than 



221). In complex 



thousand model evaluations for one input parameter (Saltelli et al. 
industrial applications, this approach is intractable due to the cpu time cost of one model 
evaluation and the possible large number of input parameters. 
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2.2 The "trigger" parameter method 

Dealing with spatially distributed input variables, Tarantola et al. [211 propose an alter- 
native that uses an additional scalar input parameter ^ - called the "trigger" parameter. 
^ ^ U[0, 1] governs the random function simulation. More precisely, for each simulation, 
if ^ < 0.5, the functional parameter e{u) is fixed to a nominal value So{u) (for example 
the mean E[e(it)]), while if ^ > 0.5, the functional parameter e{u) is simulated. Using this 
methodology, it is possible to estimate how sensitive the model output is to the presence 
of the random function. Tarantola et al. use the Extended FAST method to com- 
pute the first order and total sensitivity indices of 6 scalar input factors and 2 additional 
"trigger" parameters. For their study, the sensitivity indices according to the "trigger" pa- 
rameters are small and the authors conclude that it is unnecessary to model these spatial 
errors more accurately. 

Contrary to the previous method, there is no restriction about the sensitivity indices 
estimation procedure - i.e. Monte-Carlo, FAST, quasi Monte-Carlo. However, there are 
two major drawbacks for this approach: 

• As the macroparameter method, it also requires the use of the computer model to 
perform the SA and it may be problematic for large cpu time computer models. This 
problem can be compensated by the use of an efficient quasi Monte-Carlo algorithm 
for which the sampling design size can be decreased to iV = 100. 

• As underlined by Tarantola et al. [g?!, ^ reflects only the presence or the absence of 
the stochastic errors on £o(''^)- Therefore, the term Var[E(y|^)] does not quantify 
the contribution of the random function variability to the output variability Var(y). 
We will discuss about the significance of Var[E(y|^)] later, during our analytical 
function application. 

2.3 The joint modehng approach 

To perform a variance-based SA for time consuming computer models, some authors pro- 
pose to approximate the computer code, starting from an initial small-size sampling design, 
by a mathematical function often called response surface or metamodel (Marseguerra et al. 



14|, Volkova et al. [28l|, Fang et al. [3|). This metamodel, requiring negligible cpu time, is 
then used to estimate Sobol's indices by any method, for example the simple Monte-Carlo 
algorithm. For metamodels with sufficient prediction capabilities, the bias between the 
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exact Sobol's indices (from the computer code) and the Sobol's indices estimated via the 
metamodel is neghgible. Indeed, it has been shown that the unexplained variance part of 
the cornputer code by the metamodel (which can be measured) corresponds to this bias 
(sobol m). Several choices of metamodel can be found in the literature: polynomials, 
splines, Gaussian processes, neural networks, . . . The fitting process is often based on 
least squares regression techniques. Thus, for the functional input problem, one strategy 
may be to fit a metamodel with a multi-dimensional scalar parameters representing s(u) 
as an input parameter - i.e. its discretization or its decomposition into an appropriate 
basis. This process would correspond to a metamodeling approach for the macroparame- 
ter method. However, this approach seems to be impracticable due to the potential large 
number of scalar parameters: applying regression techniques supposes to have more obser- 
vation points (simulation sets) than input parameters and important numerical problem 
(like matrix conditioning) might occur while dealing with correlated input parameters. 

A second option is to substitute each random function realization for a discrete number, 
which can correspond to the scenario parameter of Ruffo et al. [isl (where the number 
of geostatistical realizations is finite and fixed, and where each different value of the 
discrete parameter corresponds to a different realization). Then, a metamodel is fitted 
using this dicrete parameter as a qualitative input variable. However, using a metamodel 
is interesting when only a few runs of the code is available, which correponds to a more 
limited number of realizations of the functional input. This restriction of the possible 
realizations of the input random function to a few ones is not appropriate in a general 
context. 

Another strategy considers e{u) as an uncontrollable parameter. A metamodel is fitted 
in function of the other scalar parameters X: 

Y„,{X)=E{Y\X) (7) 

Therefore, using the relation 

Var(r) = Var[E(r |X)] + E[Var(r |X)] (8) 

it can be easily shown that the sensitivity indices of Y given the scalar parameters X = 
{Xi)i^i,,,p write (looss et al. Q) 
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_ Var[E(y„,|XO] _ Var[E(y„|X,X,-)] _ ^ _ ^ 

Var(r) ' Var(r) 

and can be computed by classical Monte-Carlo techniques applied on the metamodel 
Ym- Therefore, using equation ([8]), the total sensitivity index of Y according to e{u) 
corresponds to the expectation of the unexplained part of Var(y) by the metamodel Ym'. 

E[Var(y|X)] 
- Var(y) 

Using this approach, our objective is altered because it is impossible to decompose the 
£ effects into an elementary effect (S^) as well as the interaction effects between e and 
the scalar parameters However, we see below that our technique allows a 

qualitative appraisal of the interaction indices. 

The sensitivity index estimations from equations ([9]) and (|10[) raise two difficulties: 

1. It is well known that classical parametric metamodels (based on least squares fitting) 
are not adapted to estimate E(y |X) accurately due to the presence of heteroscedas- 
ticity (induced by the effect of e). Such cases are analyzed by looss et al. The 
authors show that heteroscedasticity may lead to sensitivity indices misspecifica- 
tions. 

2. Classical non parametric methods, such as Generalized Additive Models (Hastie 
and Tibshirani [6]) and Gaussian processes (Sacks et al. [19|) that can provide 
efficient estimation of E(y|X) (examples are given in looss et al. ^1), even in 
high dimensional input cases {p > 5). However, these approaches are based on a 
homosccdasticity hypothesis and do not enable the estimation of Var(y|X). 

To solve the second problem, Zabalza-Mezghani et al. 
developed for experimental data (McCuUagh and Nelder 15|): the simultaneous fitting of 



3p| propose the use of a theory 



the mean and the dispersion by two interlinked Generalized Linear Models (GLM), which 
is called the joint modeling (see Appendix A.l). Besides, to resolve the first problem, 
this approach has been extended by looss et al. 9] to non parametric models. This 
generalization allows more complexity and flexibility. The authors propose the use of 
Generalized Additive Models (GAMs) based on penalized smoothing splines (Wood 29|). 



A succint description of GAM and joint GAM is given in Appendix A. 2. GAMs allow 
model and variable selections using quasi-likelihood function, statistical tests on coeffi- 
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cients and graphical display. However, compared to other complex metamodels, GAMs 
impose an additive effects hypothesis. Therefore, two metamodels are obtained: one for 
the mean component Ym{X) — E{Y\X); and the other one for the dispersion component 
Y(i{X) ~ Yar{Y\X). The sensitivity indices of X are computed using Ym with the stan- 
dard procedure (Eq. ([5])), while the total sensitivity index of e{u) is computed from E{Yd) 
(Eq. pni ). Using the model for Yd as well as the associated regression diagnostics, it is 
possible to deduce qualitative sensitivity indices for the interactions between e(u) and the 
scalar parameters of X. 

One major assumption of the joint modeling approach is that the "mean response" of 
the computer code is well handled using Ym- Consequently, all the unexplained part of 
the computer model by this metamodel is due to the uncontrollable parameter. In other 
words, the better the mean component metamodel is, the smaller is the influence of the 
uncontrollable parameter. This is a strong assumption which has to be validated in order 
to avoid erroneous results. In fact, some simple statistical and graphical tools can be used 
while fitting the mean component (looss et al. Q]): the explained deviance value, the 
observed responses versus predicted values plot (and its quantilc-quantile plot) and the 
deviance residuals plot. This last plot allows to detect some fitting problems by revealing 
possible biases or large residual values. Some examples are given in section [3?2l These 
tools can also be applied for the dispersion component fit. For a detailed overview of these 
diagnostic tools, one can refer to McCuUagh & Nelder [l^. 

3 APPLICATION TO AN ANALYTICAL EXAMPLE 

The three previously proposed methods are first illustrated on a simple analytical model 
with two scalar input variables and one functional input: 

Y = f{Xi,X2,s{t)) = sin(Xi) + 7sin(X2)2 + 0.1[max(£(t))]4sin(Xi) (11) 

where Xi ^ U[—n;TT] for i — 1,2 and e{t) is a white noise, i.e. an i.i.d. stochastic pro- 
cess s{t) ^ A/'(0, 1). In our model simulations, s{t) is discretized in one hundred values: 
t — 1 . . . 100. The function (jlip is similar to the well-known Ishigami function (Homma 
and Saltelli [Sj]) but substitute the third parameter for the maximum of a stochastic pro- 
cess. Consequently, we call our function the white-noise Ishigami function (WN-Ishigami) . 
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Although the WN-Ishigami function is an analytical model, the introduction of the maxi- 
mum of a stochastic process inside a model is quite realistic. For example, some computer 
models simulating physical phenomena can use the maximum of time-dependent variable 
- river height, rainfall quantity, temperature. Such input variable can be modeled by a 
temporal stochastic process. 

As for the Ishigami function, we can immediately deduce from the formula (jlip : 



Se = Su = S2e = Sue = (12) 

Then, we have 



Sti — Si + Sis, St2 — S2, St^ — S^ (13) 

In the following, we focus our attention on the estimation of Si, S2 and St^. 

Because of a particularly complex probability distribution of the maximum of a white 
noise, there is no analytical solution for the theoretical Sobol's indices Si, S2 and St^ for 
the WN-Ishigami function. Even with the asymptotic hypothesis (number of time steps 
tending to infinity), where the maximum of the white noise follows Generalized Extreme 
Value distribution, theoretical indices are unreachable. Therefore, our benchmark Sobol's 
indices values are derived from the Monte-Carlo method. 



3.1 The macroparameter and "trigger" parameter methods 

Table [1] contains the Sobol's index estimates using the macroparameter and "trigger" 
parameter methods. As explained before, we can only use some algorithms based on 
independent Monte-Carlo samples. We apply the algorithm of Sobol [2^ that computes 
^i, S2, Sie at a cost n = 4iV and the algorithm of Saltelli [20! which computes the first 
order indices 6*1 , iS'2 and the total sensitivity indices Sti , St2 , St^ at a cost n — bN (where 
is the size of the Monte-Carlo samples, cf. section [2TT|) . For the estimation, the size 
of the Monte-Carlo samples is limited to A^ 10000 because of memory computer limit. 
Indeed, the functional input s{u) contains for each simulation set 100 values. Then, the 
input sample matrix has the dimension A^ x 102 which becomes extremely large when A^ 
increases. To evaluate the effect of this limited Monte-Carlo sample size A, each Sobol's 
index estimate is associated to a standard-deviation estimated by bootstrap (Saltelli et al. 



22l |) - with 100 replicates of the input-output sample. The obtained standard-deviations 
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(sd) are relatively small, of the order of 0.01, which is rather sufRcient for our exercise. 

Remark: We have also tried to estimate Sobol's indices with smaller Monte-Carlo 
sample sizes N. The order of the obtained standard- deviations (estimated by bootstrap) of 
the Sobol's estimates are the following: sd - 0.02 for N = 5000, sd - 0.04 for N = 1000 
and sd ~ 0.06 for N ~ 500. We conclude that the Monte-Carlo estimates are sufficiently 
accurate for N > 5000. 

[Table 1 about here.] 

Macroparameter 

For the macroparameter method, the theoretical relations between indices given in l|13p 
are satisfied. We are therefore confident with the estimates obtained with this method and 
we choose the Sobol's indices obtained with Saltelli's algorithm as the reference indices: 

Si = 55.1%, S2 = 20.7%, St^ = 24.8% 

The S'e, 5*12, S2e and Si2e indices (Eq. are not reported in table [1] as estimates are 

negligible. 

Trigger parameter 

Using the "trigger" parameter method, the estimates reported in table [1] are quite far 
from the reference values. The inadequacies are larger than 30% for all the indices, and 
can be larger than 60% for a few ones {S2 and S'ts)- Moreover, the relations given in (|13[) 
are not satisfied at all. Actually, replacing the input parameter e{u) by ^ which governs 
the presence or the absence of the functional input parameter changes the model. When s 
is not simulated, it is replaced by its mean (zero) and the WN-Ishigami function becomes 
Y = sin(Xi) + 7sin(X2)^. Therefore, the mix of the WN-Ishigami model and this new 
model perturbs the estimation of the sensitivity indices, even those unrelated to e (like 
X2). In conclusion, the obtained results are in concordance with the expected results. 

This result confirms our expectation: sensitivity indices derived from the "trigger" 
parameter method have not the same sense that the classical ones, i.e., the measure of 
the contribution of the input parameter variability to the output variable variability. The 
sensitivity indices obtained with these two methods are unconnected because the "trigger" 
parameter method changes the structure of the model. 
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3.2 The joint modeling approach 

We apply now the joint modeling approach which requires an initial input-output sample 
to fit the joint metamodel - the mean component Ym and the dispersion component Yd- 
For our application, a learning sample size of n — 500 was considered; i.e., n independent 
random samples of (Xi, X2, £{u)) were simulated leading to n observations for Y. Joint 
GLM and joint GAM fitting procedures are fully described in looss et al. [9]. Some 
graphical residual analyses are particularly useful to check the relevance of the mean and 
dispersion components of the joint models. In the following, we give the results of the 
joint models fitting on a learning sample {Xi, X2,e(u),Y). Let us recall that we fit a 
model to predict Y in function of {Xi, X2). 

Joint GLM fitting 

For the joint GLM, a fourth order polynomial for the parametric form of the model 
is considered. Moreover, only the explanatory terms are retained in our regression model 
using analysis of deviance and the Fisher statistics. The equation of the mean component 
writes: 

y,„ = 1.77 + 4.75X1 + IMXi - O.blXf - 0.26X1 . (14) 

The value estimates, standard-deviation estimates and Student test results on the regres- 
sion coefficients are given in table [51 Residuals graphical analysis makes it also possible 
to appreciate the model goodness-of-fit. 

[Table 2 about here.] 

The explained deviance of this model is D^xpi — 73%. It can be seen that it remains 
27% of non explained deviance due to the model inadequacy and/or to the functional input 
parameter. The predictivity coefficient, i.e. coefficient of determination B? computed on 
a test sample, is Q2 — 70%. Q2 is relatively coherent with the explained deviance. 

For the dispersion component, using analysis of deviance techniques, none significant 
explanatory variable were found: the heteroscedastic pattern of the data has not been 
retrieved. Thus, the dispersion component is supposed to be constant (see Table [5]); 
and the joint GLM model is equivalent to a simple GLM - but with a different fitting 
procedure. 
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Joint GAM fitting 

At present, we try to model the data using a joint GAM. For eacli component (mean 
and dispersion), Student test for tlie parametric part and Fisher statistics for the non 
parametric part allow us to keep only the explanatory terms (see Table 121) • The resulting 
model is described by the following features: 

= 3.76-5.54Ai + si(Ai) + S2(A2) , ^ ^ 

(15) 

\og{Yd) = 1.05 + Sdi(Ai) , 
where si(-), S2(') and Sdi{-) denote three penalized spline smoothing terms. 

[Table 3 about here.] 

The explained deviance of the mean component is Dexpi = 92% and the predictiv- 
ity coefficient is Q2 = 77%. Therefore, the joint GAM approach outperforms the joint 
GLM one. Indeed, the proportion of explained deviance is clearly greater for the GAM 
model. Even if this is obviously related to an increasing number of parameters; this is 
also explained as GAMs are more flexible than GLMs. This is confirmed by the increase 
of the predictivity coefficient - from 70% to 77%. Moreover, due to the GAMs flexibility, 
the explanatory variable Xi is identified for the dispersion component. The interaction 
between Xi and the functional input parameter e{u) which governs the heteroscedasticity 
of this model is therefore retrieved. 

Figure [1] shows that the deviance residuals for the mean component of the joint GAM 
seem to be more homogeneously dispersed around the x-axis than the deviance residuals 
of the joint GLM. This leads to a better prediction from the joint GAM on the whole 
range of the observations. 

[Figure 1 about here.] 

Sobol's indices 

From the joint GLM and the joint GAM, Sobol's sensitivity indices can be computed 
using equations ([9]) and (fTO)l - see Tabled The reference values are extracted from the 
results of the macroparameter method via Saltelli's algorithm (see Table [T|) and from the 
WN-Ishigami analytical form pT|) (for example we know that 5*12 = because there is 
no interaction between Xi and A2). The standard deviation estimates (sd) are obtained 
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from 100 replicates of the Monte-Carlo estimation procedure - which uses N — 10000 for 
the size of the Monte-Carlo samples (see section [^TT|) . The joint GLM and joint GAM give 
approximately good estimates of 5*1 and 82- Despite the joint GLM leads to an acceptable 
estimation for St^ , we will see later that it is fortuitous. The estimation of St^ with 
the joint GAM seems also satisfactory but not accurate. In fact, an efficient modeling of 
Var(y|X) is difficult, which is a common statistical difficulty in heteroscedastic regression 
problems (Antoniadis & Lavergne (l^). 

Another way to estimate the total sensitivity index St^ is to compute the unexplained 
variance of the mean component model given directly by 1 — Q2, where Q2 is the pre- 
dictivity coefhcient of the mean component model. In practical applications, Q2 can be 
estimated using leave-one-out or cross validation procedures. In our analytical case, the 
index is estimated with the former method and leads to a correct estimation - 0.23 instead 
of 0.25. 

[Table 4 about here.] 

For the other sensitivity indices, the conclusions draw from the GLM formula are 
completely erroneous. As the dispersion component is constant, — St^ = 0.268 while 
Se = in reality. In contrary, the deductions draw from GAM formulas are exact: 
{Xi,e) interaction sensitivity is strictly positive (^le > 0) because Xi is active in the 
dispersion component Yd, S2e — Si2e = 0, — S2 and 5*12 = S23 — S'123 = 0. The 
drawback of this method is that some indices (S'le, and Sti) remain unknown due to 
the non separability of the dispersion component effects. However, we can easily deduce 
some variation intervals which contain these indices: and Si^ are smaller than St^ 
while Si + min(S'ie) l£ Sti i£ Si + max(S'ig). All these additional information provide 
qualitative importance measures for the unknown indices. 

By estimating Sobol's indices with those obtained from other learning samples, we 
observe that the estimates are rather dispersed: it seems that the estimates are not robust 
according to different learning samples for the joint models. To examine this effect, we 
propose to study two different sample sizes {n — 200 and n = 500). For each sample size, 
the distribution of the Sobol's indices estimates is assessed using a bootstrap procedure. 
Figures [2] and [3] show the results of this investigation, which are particularly convincing. 
Several conclusion can be drawn: 

• For the joint GAM, the boxplot interquartile interval of each index contains its 
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reference value. In contrary, the joint GLM fails to obtain correct estimates: except 
for Si , the sensitivity reference values are outside the interquartile intervals of the 
obtained boxplots. 

• The superiority of the joint GAM with respect to the joint GLM is corroborated, 
especially for 5*2 and St^ . 

• The increase of the learning sample size has no effect on the joint GLM results (due 
to the parametric form of this model). However, for the joint GAM, boxplots widths 
are strongly reduced from n = 200 to n = 500. In addition, the mean estimates 
seem to converge to the reference values. 

• As explained before, the estimation of St^ using the predictivity coefficient Q2 is 
markedly better than through the dispersion component model. This is not the 
case for the joint GLM. Moreover, we confirm that the previous result of table IH 
St^ — 0.268, was a good case: with 100 replicates, St^ ranges from 0.24 to 0.35 
(Figure 

[Figure 2 about here.] 

[Figure 3 about here.] 

In conclusion, this example shows that the joint models, and specially the joint GAM, 
can adjust rather complex heteroscedastic situations. Of course, additional tests are 
needed to confirm this result. Moreover, the joint models offer a theoretical basis to 
compute efficiently global sensitivity indices for models with functional input parame- 
ter. Finally, the required number of computer model evaluations is much smaller with 
the joint modeling method (here n = 200 or n = 500 gives good results with the joint 
GAM) compared to the one of Monte-Carlo based techniques. For exemple, using the 
macroparameter method (cf. section 13. ip and taking N = 5000, we need to compute 
n = 25000 model evaluations to estimate first order and total sensitivity indices (via 
Saltelh's algorithm). 
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4 APPLICATION TO A NUCLEAR FUEL IRRADI- 
ATION SIMULATION 



The METEOR computer code, developed within the Fuel Studies Department in CEA 
Cadarache, studies the thermo-mechanical behavior of the fuel rods under irradiation in 
a nuclear reactor core. In particular, it computes the fission gas swelling and the cladding 
creep (Garcia et al. jjl). These two output variables are considered in our analysis. 
These variables are of fundamental importance for the physical understanding of the fuel 
behavior and for the monitoring of the nuclear reactor core. 

Input parameters of such mechanical models can be evaluated either by database analy- 
ses, arguments invoking simplifying hypotheses, expert judgment. All these considerations 
lead to assign to each input parameter a nominal value associated with an uncertainty. 
In this study, six uncertain input parameters are considered: the initial internal pressure 
Xi, the pellet and cladding radius X2, X^, the microstructural fuel grain diameter ^4, 
the fuel porosity X^ and the time-dependent irradiation power P{t). Xi, ^5 are 

all modeled by Gaussian independent random variables with the following coefficient of 
variations: cv(A:i) = 0.019, cv(A:2) = 1.22 x 10"^ cviXs) = 1.05 x lO"^^ cv(A:4) = 0.044, 
cv{X5) — 0.25. The last variable P{t) is a temporal function (discretized in 3558 values) 
and its uncertainty e{t) is modelled like a stochastic process. For simplicity, an Additive 
White Noise (AWN), of uniform law ranging between —5% and -1-5%, was introduced. 

As in the previous application, additionally to its scalar random variables, the model 
includes an input functional variable P(t). To compute Sobol's indices of this model, we 
have first tried to use the macroparameter method. We have succedeed to perform the 
calculations with N ~ 1000 (for the Montc-Garlo sample sizes of Eqs. ([5a|) . (j6b[) and (pc)) ). 
The sensitivity indices estimates have been obtained after 10 computation days and were 
extremely imprecise, with strong variations between and 1. Because of the required 
cpu time, an increase of the sample size TV to obtain acceptable sensitivity estimates was 
unconceivable. Therefore, the goal of this section is to show how the use of the joint 
modeling approach allows to estimate the sensitivity indices of the METEOR model and, 
in particular, to quantify the functional input variable influence. 

500 METEOR calculations were carried out using Monte-Carlo sampling of the input 
parameters (using Latin Hypercube Sampling). As expected, the AWN on P{t) generates 
an increase in the standard deviation of the output variables (compared to simulations 
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without a white noise) : 6% increase for the variable fission gas swelling and 60% for the 
variable cladding creep. 

4.1 Gas swelling 

We start by studying the gas swelling model output. With a joint GLM, the following 
result for Ym and were obtained: 

Y^ = -76 - OAXi + 20X2 + 8X4 + 134X5 + 0.02X3 - 2X2X4 - 6X4X5 

(16) 

\og{Yd) = -2.4X1 

The explained deviance of the mean component is D^xpi — 86%. As the residual analyses 
of mean and dispersion components do not show any biases, the resulting model seems 
satisfactory. The joint GAM was also fitted on these data and led to similar results. Thus, 
it seems that spline terms are useless and that a joint GLM model is appropriate. 

Tablc[5]shows the results for the Sobol's indices estimation using Monte-Garlo methods 
applied on the metamodel (|16p. The standard deviation (sd) estimates are obtained from 
100 replicates of the Monte-Carlo estimation procedure -which uses 10^ model computa- 
tions for one index estimation. It is useless to perform the Monte-Carlo estimation for 
some indices because they can be deduced from the joint model equations. For example, 
53 = (resp. 5^2 = 0) because X3 (res. X2) is not involved in the mean (resp. dis- 
persion) component in equation (|16p . Moreover, we know that Si^ > because Xi is an 
explanatory variable inside the dispersion component Y^. However, this formulation does 
not allow to have any idea about which refiects the first order effect of e. Therefore, 

some indices are not accessible, such as and non distinguishable inside the total 

5 5 

sensitivity index St^ . Finally, we can check that Si + Sij + St^ = 1 holds - 

i—l i,j — l^i<j 

up to numerical approximations. 

It can be seen that X4 (grain diameter) and X5 (fuel porosity) are the most influent 
factors (each one having 40% of influence), and do not interact with the irradiation power 
P{t) (represented by its uncertainty e). In addition, the effect of P{t) is not negligible 
(St^ = 14%) and parameter Xi (internal pressure) acts mainly with its interaction with 
P{t). A sensitivity analysis by fixing Xi could allow us to obtain some information about 
the first order effect of e in the model. 
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4.2 Cladding creep 

We study now the cladding creep model output. With a joint GLM, the model for Ym 
and Yd is: 

Ym =-2.75 + 1.05X2-0.15X3-0.58X5 

(17) 

log{Yd) = 156052 - 76184X2 + 9298X| 

The explained deviance of the mean component is D^,j.pi — 26%. As the residual analyses of 
mean and dispersion components show some biases, the resulting model is not satisfactory. 

For the joint GAM, the spline terms {s(X2), s{X^)^ s(X5)} and s(X2) are added within 
the mean component and the dispersion component respectively. The explained deviance 
of the mean component is D^xpi — 29% which is not significantly greater than 26%. 
However, as the mean component residual biases of the joint GAM are smaller than those 
observed for the joint GLM, the joint GAM seems to be more relevant than the joint 
GLM. 

Table[n]shows the Sobol's index estimates using Monte-Carlo methods and deductions 

from the joint model equations. For the joint GLM and joint GAM of the cladding creep, 

5 5 

5Z S'i + Sij + St^ = 1 holds - up to numerical imprccisions. Due to the proximity 

of the two joint models, results are similar. This analysis shows that the parameter X2 
(pellet radius) explains 28% of the uncertainty of the cladding creep phenomenon, while 
the other scalar parameters have negligible influence. The greater part of the cladding 
creep variance {St^ = 70%) is explained by the irradiation power uncertainty (the AWN). 
Physicists may be interested in quantifying the interaction influence between the pellet 
radius and the irradiation power. Unfortunately, this interaction is not available for the 
moment in our analysis. 

[Table 5 about here.] 

5 CONCLUSION 

This paper has proposed a solution to perform global sensitivity analysis for time consum- 
ing computer models which depend on functional input parameters, such as a stochastic 
process or a random field. Our purpose concerned the computation of variance-based im- 
portance measures of the model output according to the uncertain input parameters. We 
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have discussed a first natural solution which consists in integrating the functional input 
parameter inside a macroparameter, and using standard Monte-Carlo algorithms to com- 
pute sensitivity indices. This solution is not applicable for time consuming computer code. 
We have discussed another solution, used in previous studies, based on the replacement of 
the functional input parameter by a "trigger" parameter that governs the integration or 
not of the functional input uncertainties. However, the estimated sensitivity indices are 
not the expected ones due to changes in the model structure carrying out by the method 
itself. Finally, we have proposed an innovative strategy, the joint modeling method, based 
on a preliminary step of double (and joint) metamodel fitting, which resolves the large 
cpu time problem of Monte-Carlo methods. It consists in rejecting the functional input 
parameters in noisy input variables. Then, two metamodels depending only on the scalar 
random input variables are simultaneously fitted: one for the mean function and one for 
the dispersion (variance) function. 

Tests on an analytical function have shown the relevance of the joint modeling method, 
which provides all the sensitivity indices of the scalar input parameters and the total 
sensitivity index of the functional input parameter. In addition, it reveals in a qualitative 
way the influential interactions between the functional parameter and the scalar input 
parameters. It would be interesting in the future to be able to distinguish the contributions 
of several functional input parameters that are currently totally mixed in one sensitivity 
index. This is the main drawback of the proposed method in its present form. 

In an industrial application, the usefulness and feasibility of our methodology has been 
established. Indeed, other methods are not applicable in this application because of large 
cpu time of the computer code. To a better understanding of the model behavior, the 
information brought by the global sensitivity analysis can be very useful to the physicist 
or the modeling engineer. The joint model can also be useful to propagate uncertainties 
in complex models, containing input random functions, to obtain some mean predictions 
with their confldence intervals. 
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APPENDIX A: JOINT MODELING OF MEAN AND 
DISPERSION 



A.l Joint Generalized Linear Models 

GLMs allow to extend traditional linear models by the use of a distribution which belongs 
to the exponential family and a link function that connects the explanatory variables to 
the explained variable (Nelder & Wedderburn [17]). The joint GLM consists in putting a 
GLM on the mean component of the model and a GLM on the dispersion component of 
the model. 

The mean component is therefore described by: 



where (li)i=i...n are independent random variables with mean /i^; Xij are the observations 
of the parameter Xj] [3j are the regression parameters that have to be estimated; rji is 
the mean linear predictor; g{-) is a differentiable monotonous function (called the link 
function); (pi is the dispersion parameter and v[-) is the variance function. To estimate 
the mean component, the functions g{-) and v{-) have to be specified. Some examples 
of link functions are the identity (traditional linear model), root square, logarithm, and 
inverse functions. Some examples of variance functions are the constant (traditional linear 
model), identity and square functions. 

Within the joint modeling framework, the dispersion parameter (pi is not supposed to 
be constant as in a traditional GLM, but is supposed to vary according to the model: 



(19) 

Var(di) = TVd{4>i) , 

where di is a statistic representative of the dispersion, jj are the regression parameters 




(18) 
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that have to be estimated, h{-) is the dispersion hnk function, Q is the dispersion hnear 
predictor, t is a constant and Vd{-) is the dispersion variance function. Uij are the obser- 
vations of the explanatory variable Uj. The variables (Uj) are generally taken among the 
explanatory variables of the mean (Xj), but can also be different. To ensure positivity, 
a log link function is often chosen for the dispersion component. For the statistic repre- 
senting the dispersion d, the deviance contribution (which is close to the distribution of 
a x^) is considered. Therefore, as the is a particular case of the Gamma distribution, 
Vdiip) — 0^ a-iid T ^ 2. In particular, for the Gaussian case, these relations are exact: d is 
distributed and r = 2. 

The joint model is fitted using Extended Quasi-Loglikelihood (EQL) (Nelder & Pregi- 
bon [l^) maximization. The EQL behaves as a log-likelihood for both mean and dispersion 
parameters. This justifies an iterative procedure to fit the joint model. Statistical tools 
available in the GLM fitting are also available for each component of the joint model: de- 
viance analysis. Student and Fisher tests, residuals graphical analysis. It allows to make 
some variable selection in order to simplify model expressions. 



A. 2 Joint Generalized Additive Models 

Generalized Additive models (GAM) allow the linear term in the linear predictor 77 = 
J2j PjXj of equation to be replaced by a sum of smooth fmictions rj = J^j 
(Hastie & Tibshirani |6l|). The Sj(.)'s are unspecified functions that are obtained by fitting 
a smoother to the data, in an iterative procedure. GAMs provide a flexible method for 
identifying nonlinear covariate effects in exponential family models and other likelihood- 
based regression models. The fitting of GAM introduces an extra level of iteration in 
which each spline is fitted in turn assuming the others known. GAM terms can be mixed 
quite generally with GLM terms in deriving a model. 

One common choice for Sj are the smoothing splines, i.e. splines with knots at each 
distinct value of the variables. In regression problems, smoothing splines have to be pe- 
nalized in order to avoid data overfitting. Wood [29| has described in details how GAMs 



can be constructed using penalized regression splines. This approach is particularly ap- 
propriate as it allows the integrated model selection using Generalized Cross Validation 
(GCV) and related criteria, the incorporation of multi-dimensional smooths and relatively 
well founded inference using the resulting models. Because numerical models often ex- 
hibit strong interactions between input parameters, the incorporation of multi-dimensional 
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smooth (for example the bi-dimensional sphne term Sij{Xi, Xj)) is particularly important 
in our context. 

GAMs are generally fitted using penalized likelihood maximization. For this purpose, 
the likelihood is modified by the addition of a penalty for each smooth function, penalizing 
its "wiggliness". Namely, the penalized loglikelihood (PL) is defined as: 



2 

dxj (20) 



where L is the loglikelihood function, p is the total number of smooth terms and Aj are 
"tuning" constants that compromise between goodness of fit and smoothness. Estimation 
of these "tuning" constants is generally achieved using the GCV score minimization (Wood 



291). 



We have seen that GAMs extend in a natural way GLMs. looss et al. [9| have shown 
how to extend joint GLM to joint GAM. Extension of PL to penalized extended quasi- 
likelihood (PEQL) is straightforward by substituting the likelihood function PL and the 
deviance d for their extended quasi counterparts. The fitting procedure of the joint GAM 
is similar to the one of joint GLM. 
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List of Figures 



Deviance residuals for the joint GLM and the Joint GAM versus the 
fitted values (WN-Ishigami application). Dashed lines correspond 

to local polynomial smoothers [27 

WN-Ishigami application. Comparison of Sobol's indices estimates 
for the learning sample size: n = 200. For each index, the horizontal 

line is the reference value [28 

WN-Ishigami application. Comparison of Sobol's indices estimates 
for the learning sample size: n = 500. For each index, the horizontal 
line is the reference value 2S 
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Joint GAM 




Figure 1: Deviance residuals for the joint GLM and the Joint GAM versus the fitted values 
(WN-Ishigami application). Dashed lines correspond to local polynomial smoothers. 
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Figure 2: WN-Ishigami application. Comparison of Sobol's indices estimates for the learning 
sample size: n = 200. For each index, the horizontal line is the reference value. 
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Figure 3: WN-Ishigami application. Comparison of Sobol's indices estimates for the learning 
sample size: n = 500. For each index, the horizontal line is the reference value. 
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List of Tables 



1 Sobol's sensitivity indices (with standard deviations sd) obtained 
from two Monte-Carlo algorithms (Sobol [24] and Saltelli and 
two integration methods of the functional input e (macroparame- 
ter and "trigger" parameter) on the WN-Ishigami function. " — " 
indicates that the value is not available 

2 For the WN-Ishigami function, summary results of the joint GLM 
fitting, for the mean component Ym and the dispersion component 
Y^. Estimate standard errors as well as statistics and p-values for 
the Student's test are reported 

3 For the WN-Ishigami function, summary results of the joint GAM 
fitting, for the mean component Ym and the dispersion component 
Y^. Estimate standard errors as well as statistics and p-values for 
the Student's test are reported. For the smoothing splines, the 
estimated degree of freedom (edf), the rank of the smoother and 
the statistics and p-values for the null hypotheses that each smooth 
term is zero are reported 

4 Sobol's sensitivity indices (with standard deviations) for the WN- 
Ishigami function: exact and estimated values from joint GLM and 
joint GAM (fitted with a 500-size sample). "Method" indicates the 
estimation method: MC for the Monte-Carlo procedure, Eq for a 
deduction from the model equations and Q2 for the deduction of 
the predictivity coefficient Q2- " — " indicates that the value is not 
available 

5 Sobol's sensitivity indices (with standard deviations sd) from joint 
models fitted on the outputs of the METEOR code. "Method" in- 
dicates the estimation method: MC for the Monte-Carlo procedure 
and Eq for a deduction from the joint model equation. " — " indi- 
cates that the value is not available 
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Table 1: Sobol's sensitivity indices (with standard deviations sd) obtained from two Monte- 
Carlo algorithms (Sobol 2j| and Saltelli [20]) and two integration methods of the functional 
input £ (macroparameter and "trigger" parameter) on the WN-Ishigami function. " — " indi- 
cates that the value is not available. 



Macroparameter "Trigger" parameter 



Indices 


Sobol 


s algo 


Saltelh 


algo 


Sobol's 


algo 


Saltelh 


algo 


Values 


sd 


Values 


sd 


Values 


sd 


Values 


sd 


Si 


0.540 


1.3e-2 


0.551 


1.6e-2 


0.304 


1.3e-2 


0.330 


1.8e-2 


Sti 






0.808 


2.0e-2 






0.656 


1.4e-2 


S2 


0.197 


l.le-2 


0.207 


0.8e-2 


0.329 


1.4e-2 


0.348 


1.5e-2 








0.212 


0.7e-3 






0.532 


1.3e-2 


Sle 


0.268 


2.4e-2 






0.177 


2.2e-2 












0.248 


1.3e-2 






0.336 


1.4e-2 
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Table 2: For the WN-Ishigami function, summary results of the joint GLM fitting, for the 
mean component and the dispersion component Y^. Estimate standard errors as well as 
statistics and p- values for the Student's test are reported. 





Estimate 


Std. Error 


t- value 


Pr(>|t|) 


(Intercept) 


1.77495 


0.22436 


7.911 


1.68e-14 


Xi 


4.75219 


0.16283 


29.186 


< 2c-16 


XI 


1.99965 


0.14331 


13.953 


< 2e-16 


xf 


-0.51254 


0.02479 


-20.679 


< 2e-16 


XI 


-0.25952 


0.01657 


-15.659 


< 2e-16 


log{Yd) 




Estimate 


Std. Error 


t- value 


Pr(>|t|) 


(Intercept) 


1.9652 


0.1373 


14.32 


<2e-16 
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Table 3: For the WN-Ishigami function, summary results of the joint GAM fitting, for the 
mean component and the dispersion component Y^. Estimate standard errors as well as 
statistics and p-values for the Student's test are reported. For the smoothing splines, the 
estimated degree of freedom (edf), the rank of the smoother and the statistics and p-values 
for the null hypotheses that each smooth term is zero are reported. 



Y 





Estimate 


Std. Error 


t- value 


Pr(>|t|) 


(Intercept) 


3.76439 


0.09288 


40.53 


<2e-16 


Xi 


-5.53920 


0.33607 


-16.48 


<2e-16 




edf 


Est.rank 


F 


p- value 


si{Xi) 


5.656 


8 


151.1 


<2e-16 




8.597 


9 


411.4 


<2e-16 


log{Yd) 




Estimate 


Std. Error 


t- value 


Pr(>|t|) 


(Intercept) 


1.05088 


0.07885 


13.33 


<2e-16 




edf 


Est.rank 


F 


p- value 


Sdi{X^) 


8.781 


9 


36.09 


<2e-16 
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Table 4: Sobol's sensitivity indices (with standard deviations) for the WN-Ishigami function: 
exact and estimated values from joint GLM and joint GAM (fitted with a 500-size sample). 

"Method" indicates the estimation method: MC for the Monte-Carlo procedure, Eq for a 
deduction from the model equations and Q2 for the deduction of the predictivity coefficient 
Q2- " — " indicates that the vahic is not available. 



Indices 


Reference 


Joint GLM 


Joint GAM 




Values 


sd 


Values 


sd 


Method 


Values 


sd 


Method 


Si 


0.551 


1.6e-2 


0.580 


3e-3 


MC 


0.554 


4e-3 


MC 


S2 


0.207 


0.8e-2 


0.181 


7e-3 


MC 


0.228 


6e-3 


MC 




0.248 


1.3e-2 


0.268 


le-3 


MC 


0.218 


le-3 


MC 








0.30 




Q2 


0.23 




Q2 


S12 












Eq 







Eq 


Sle 


0.248 


1.3e-2 







Eq 


]0,0.23] 




Eq 


S2e 












Eq 







Eq 


Sl2e 












Eq 







Eq 


Sti 


0.808 


2.0e-2 


0.580 


3e-3 


Eq 


]0.554, 0.784] 




Eq 


St2 


0.212 


0.7e-3 


0.181 


7e-3 


Eq 


0.228 


6e-3 


Eq 


Se 







0.268 


le-3 


Eq 


[0, 0.23] 




Eq 
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Table 5: Sobol's sensitivity indices (with standard deviations sd) from joint models fitted on 
the outputs of the METEOR code. "Method" indicates the estimation method: MC for the 
Monte-Carlo procedure and Eq for a deduction from the joint model equation. " — " indicates 
that the value is not available. 





Gas 


swelling 








Cladding 


; creep 






Indices 


Joint GLM 




Joint GLM 




Joint GAM 






Values 


sd 


Method 


Values 


sd 


Method 


Values 


sd 


Method 


Si 


0.029 


6e-3 


MC 


0.000 


le-3 


MC 


0.000 


lc-3 


MC 


S2 


0.024 


5e-3 


MC 


0.294 


le-4 


MC 


0.282 


2c-4 


MC 


S3 







Eq 


0.006 


le-3 


MC 


0.007 


le-3 


MC 


S4 


0.394 


5e-3 


MC 


0.000 


le-3 


MC 


0.000 


le-3 


MC 


S5 


0.409 


6e-3 


MC 


0.006 


le-3 


MC 


0.006 


le-3 


MC 


S2A 


0.002 


5e-3 


MC 







Eq 







Eq 


S45 


0.000 


9e-3 


MC 







Eq 







Eq 


other Sij 







Eq 







Eq 







Eq 


Sts 


0.143 


le-4 


MC 


0.694 


le-4 


MC 


0.704 


3e-4 


MC 


Se 


[0,0.143] 






[0, 0.694] 






[0, 0.704] 






Sle 


]0, 0.143] 











Eq 







Eq 


S2e 







Eq 


]0, 0.694] 






]0, 0.704] 






other Sie 







Eq 







Eq 







Eq 


Sti 


]0.029, 0.172] 






0.000 


le-3 


Eq 


0.000 


4e-3 


Eq 


St2 


0.026 


7e-3 


Eq 


]0.294, 0.988] 






]0.282, 0.986] 






St3 







Eq 


0.006 


le-3 


Eq 


0.007 


4e-3 


Eq 


St4, 


0.396 


7e-3 


Eq 


0.000 


le-3 


Eq 


0.000 


4e-3 


Eq 


Sts 


0.409 


0.011 


Eq 


0.006 


le-3 


Eq 


0.006 


4e-3 


Eq 
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